Finding and identifying text in 900+ languages

Similar resources

Finding and Identifying Text in 900+ Languages

This paper presents a trainable open-source utility for extracting text from arbitrary data files and disk images. It uses language models to detect character encodings automatically before extracting strings, and to identify languages and filter out non-textual strings after extraction. With a test set containing 923 languages, consisting of strings of at most 65 characters, a...

Identifying the Strategies Persian EFL Learners Use in Reading an Expository Text in English and Examining Its Relation to Reading Proficiency and Motivation: A Think-Aloud Study

The main aim of this study was to examine the type and extent of the strategies that Persian-speaking students of English used while reading an English text. The study also examined how the strategies used differed between readers with high and low levels of reading comprehension. The correlation between strategy use and reading comprehension on the one hand, and between strategy use and motivation on the other, was also tested in this research...

Finding Contradictions in Text

In this paper, I seek to understand the ways contradictions occur across texts and I describe a system for automatically detecting such constructions. Finding conflicting statements is foundational for text understanding, a problem which recently received a surge of interest in the computational linguistics community. Condoravdi et al. (2003) first recognized the importance of handling both ent...

Keynote Lecture 2: Text Analysis for identifying Entities and their mentions in Indian languages

The talk deals with the analysis of text at syntactic-semantic level to identify a common feature set which can work across various Indian languages for recognizing named entities and their mentions. The development of corpora and the method adopted to develop each module is discussed. The talk includes the evaluation of the common feature set using a statistical method which gives acceptable l...

Finding Sequences in Pattern Languages

This focus group was an experiment. We wanted to know what went on in people’s minds when they constructed sequences from an existing pattern language; i.e. how they selected patterns when attempting to solve a particular design problem. In this case we used the WU pattern language, which focuses on web usability (Graham, 2003). Ian Graham presented it with four of its patterns at EuroPLoP 2002...

Journal

Journal title: Digital Investigation

Year: 2012

ISSN: 1742-2876

DOI: 10.1016/j.diin.2012.05.004